Cross-modal Search for Fashion Attributes

Authors

  • Katrien Laenen
  • Susana Zoghbi
Abstract

In this paper we develop a neural network that learns intermodal representations of fashion attributes for use in a cross-modal search tool. Our neural network learns from organic e-commerce data, which is characterized by clean image material but noisy and incomplete product descriptions. First, we experiment with techniques to segment e-commerce images and their product descriptions into image fragments and text fragments, respectively, that denote fashion attributes. Here, we propose a rule-based image segmentation approach that exploits the cleanness of e-commerce images. Next, we design an objective function that encourages our model to induce a common embedding space in which a semantically related image fragment and text fragment have a high inner product. This objective function incorporates similarity information about image fragments to obtain better intermodal representations. A key insight is that similar-looking image fragments should be described by the same text fragments. We explicitly require this in our objective function, and thereby recover information that was lost to noise and incompleteness in the product descriptions. We evaluate the inferred intermodal representations in cross-modal search. We demonstrate that the neural network model trained with our objective function on image fragments acquired with our rule-based segmentation approach improves image search with textual queries by 198% for recall@1 and by 181% for recall@5 compared with a state-of-the-art image search system on the same benchmark dataset.
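As an illustration of the evaluation described in the abstract, the following is a minimal sketch of ranking images by inner product with a textual query embedding and computing recall@K. It is not the authors' implementation: the function names, the toy two-dimensional embeddings, and the simplification that each query has exactly one relevant image are all assumptions made for this example.

```python
def inner_product(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_images(text_vec, image_vecs):
    """Return image indices sorted by descending inner product with the query."""
    scores = [inner_product(text_vec, img) for img in image_vecs]
    return sorted(range(len(image_vecs)), key=lambda i: scores[i], reverse=True)


def recall_at_k(queries, image_vecs, relevant, k):
    """Fraction of queries whose single relevant image is in the top k results.

    Assumption: one relevant image index per query (hypothetical setup).
    """
    hits = 0
    for query_vec, relevant_idx in zip(queries, relevant):
        if relevant_idx in rank_images(query_vec, image_vecs)[:k]:
            hits += 1
    return hits / len(queries)


# Toy embeddings in a shared 2-D space (made up for illustration only).
image_vecs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
queries = [[0.9, 0.1], [0.2, 0.8], [0.1, 0.9]]
relevant = [0, 1, 2]  # index of the relevant image for each query

print(recall_at_k(queries, image_vecs, relevant, k=1))
print(recall_at_k(queries, image_vecs, relevant, k=2))
```

In this toy setup the third query ranks its relevant image second, so it counts as a miss at K=1 and a hit at K=2.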

Similar resources

Cross-Modal Fashion Search

In this demo we focus on cross-modal (visual and textual) e-commerce search within the fashion domain. Particularly, we demonstrate two tasks: 1) given a query image (without any accompanying text), we retrieve textual descriptions that correspond to the visual attributes in the visual query; and 2) given a textual query that may express an interest in specific visual characteristics, we retrie...

Capacitated Single Allocation P-Hub Covering Problem in Multi-modal Network Using Tabu Search

The goals of hub location problems are finding the location of hub facilities and determining the allocation of non-hub nodes to these located hubs. In this work, we discuss the multi-modal single allocation capacitated p-hub covering problem over fully interconnected hub networks. Therefore, we provide a formulation to this end. The purpose of our model is to find the location of hubs and the ...

Coordinate Discrete Optimization for Efficient Cross-View Image Retrieval

Learning compact hash codes has been a vibrant research topic for large-scale similarity search owing to the low storage cost and expedited search operation. A recent research thrust aims to learn compact codes jointly from multiple sources, referred to as cross-view (or cross-modal) hashing in the literature. The main theme of this paper is to develop a novel formulation and optimization schem...

Visual Fashion-Product Search at SK Planet

We build a large-scale visual search system which finds similar product images given a fashion item. Defining similarity among arbitrary fashion products remains a challenging problem, and there is not even an exact ground truth. To resolve this problem, we define more than 90 fashion-related attributes, and combinations of these attributes can represent thousands of unique fashion styles. The ...

Cross-modal description of sentiment information embedded in speech

Looking for new possibilities to describe the information embedded in speech, we have carried out sentiment correlation analysis between speech features and color attributes. Using single vowel utterances with different prosody and sound pressure level, we have asked subjects to select colors based on their perceptual impressions after listening to them. By analyzing selected color attributes usin...

Publication date: 2017